Optimization for Distributed Committee Machines in The Knowledge Discovery in Distributed Databases Process
Abstract
Knowledge discovery in distributed databases is the overall process of obtaining useful information from data stored in a replication topology across multiple computing systems. Distributed committee machines consist of many neural networks that work in a distributed manner on multiple computing systems to solve a single data mining task. These architectures are the neural structures best suited to working in a replication topology. Here I use multilayer perceptrons trained with the backpropagation algorithm to solve the classification task. In this paper I propose an optimized mode of operation for the entire distributed committee machine architecture: choosing the appropriate database engine, selecting the best type of replication, and reducing write operations to the distributed database as much as possible. All experiments were run on large data sets in order to achieve the best results. This study is useful in research areas such as distributed learning, adaptive eLearning, machine learning, medical diagnosis, astronomy and many others.
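As an illustration of the committee idea described in the abstract, the following minimal sketch combines the class labels predicted by several members into one decision. The abstract does not say how the member outputs are combined, so majority voting is an assumption here, and `committee_predict`, `node_a`, `node_b` and `node_c` are hypothetical names; the three simple rules merely stand in for trained multilayer perceptrons running on separate nodes.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence

# A committee member maps a feature vector to a class label.
Member = Callable[[Sequence[float]], Hashable]


def committee_predict(members: Sequence[Member], x: Sequence[float]) -> Hashable:
    """Return the label chosen by most members (assumed combination rule).

    Ties are resolved in favour of the label that was voted for first.
    """
    if not members:
        raise ValueError("a committee needs at least one member")
    votes = Counter(member(x) for member in members)
    return votes.most_common(1)[0][0]


# Hypothetical stand-ins for trained multilayer perceptrons, one per node.
def node_a(x: Sequence[float]) -> str:
    return "positive" if x[0] > 0.5 else "negative"


def node_b(x: Sequence[float]) -> str:
    return "positive" if x[1] > 0.5 else "negative"


def node_c(x: Sequence[float]) -> str:
    return "positive" if sum(x) > 1.0 else "negative"


members = [node_a, node_b, node_c]
label = committee_predict(members, [0.9, 0.2])  # two of three members vote "positive"
```

In a distributed setting each member would run next to its own database replica and only the predicted labels would need to be exchanged, which is one plausible way to keep write operations low; this is an interpretation of the abstract, not a description of the author's implementation.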
Similar resources
Economic Lot Sizing and Scheduling in Distributed Permutation Flow Shops
This paper addresses a new mixed-integer nonlinear and linear mathematical programming economic lot sizing and scheduling problem in distributed permutation flow shops with a number of identical factories and machines. Different products must be distributed between the factories, and then the assignment of products to factories and the sequencing of the products assigned to each factory has to be d...
Optimization of majority protocol for controlling transactions concurrency in distributed databases by multi-agent systems
In this paper, we propose a new concurrency control algorithm based on multi-agent systems that extends the majority protocol. We then suggest a clustering approach to improve reliability and to reduce message passing and the algorithm’s runtime. Here, we consider n different transactions working on non-conflicting data items. Considering execution efficiency of some different...
Entropy-based Consensus for Distributed Data Clustering
The increasingly larger scale of available data and the more restrictive concerns on their privacy are some of the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with a consideration for confidentiality of data; i.e. it is the negotiations among local cluster centers that are used in t...
Application of Rough Set Theory in Data Mining for Decision Support Systems (DSSs)
Decision support systems (DSSs) are prevalent information systems for decision making in many competitive business environments. In a DSS, the decision-making process is closely tied to factors that determine the quality of information systems and their related products. Traditional approaches to data analysis usually cannot be implemented in sophisticated companies, where managers ne...
A Distributed Algorithm for Mining Fuzzy Association Rules
Data mining, also known as knowledge discovery in databases, is the process of discovering potentially useful, hidden knowledge or relations among data in large databases. An important topic in data mining research is the discovery of association rules. The majority of databases are distributed nowadays. This paper presents an algorithm for mining fuzzy association rules f...